A cartoon shows a top-down view of rows of cubicle workers.
Image: Shutterstock

Underpaid Workers Are Being Forced to Train Biased AI on Mechanical Turk

Workers who label images on platforms like Mechanical Turk say they’re being incentivized to make their responses fall in line with the majority—or risk losing work.
March 8, 2021, 9:00am
On the Clock is Motherboard's reporting on the organized labor movement, gig work, automation, and the future of work.

Like many other workers on Amazon's Mechanical Turk platform, Riley does the grueling and repetitive work of labeling images used to train artificial intelligence systems. Last year, he was one of the more than 6,000 microworkers assigned to help train ArtEmis, an algorithmic system that analyzes the emotional impact of more than 50,000 artworks from the 15th century to the present.

But this process—which prompted microworkers to respond to artworks with subjective labels like “excitement”, “amusement”, “fear”, “disgust”, “awe”, “anger” or “sadness”—was highly skewed toward producing certain results. Often, these low-paid workers on platforms such as Mechanical Turk are compelled to submit answers that fall in line with the majority—or risk losing jobs by deviating from the norm.


“To be honest, a lot of it felt very forced,” Riley told Motherboard. “There were many images that were just formless blobs or of basic objects. It was quite a stretch to come up with emotions and explanations at times.”  

Motherboard granted Riley and several other microtask workers anonymity because they feared retaliation and losing job opportunities on the platforms. Others approached for interviews cited non-disclosure agreements (NDAs) they had signed. 

While data annotations can themselves be affected by the individual unconscious biases of the workers, majority thinking can be perpetuated by employers or the platforms themselves, with the threat of a ban or rejection looming if people’s answers deviate too strongly from the majority.

“If your answers just differ a little too much from everybody else, you may get banned,” said Sarah, who labels datasets for the Germany-based platform Clickworker and the Massachusetts-based Lionbridge. Sarah lives in a politically repressive country and finds the income from Clickworker essential to her livelihood. 

“I sometimes find myself thinking like, I think this is a wrong answer ... but I know that if I say what I really think I will get booted from the job, and I will get bad scores,” said Sarah. "And I'm like, okay, I will just do what they want me to do. Even though I think it's a shitty choice.” 


A paper published in February by researchers at Cornell, the Université de Montréal, the National Institute of Statistical Sciences, and Princeton University highlighted that “many of these workers are contributing to AI systems that are likely to be biased against underrepresented populations in the locales they are deployed in.” It further noted that “the development of AI is highly concentrated in countries in the Global North for a variety of reasons” (abundance of capital, well-funded research institutions, technical infrastructure).

Riley said that the Mechanical Turk workers eligible to participate in the ArtEmis study “must be located in Australia, USA, Great Britain, or Canada,” among other requirements. Its output will therefore be dramatically skewed towards the Global North, and responses across the world may differ drastically. Previous studies have shown that 75 percent of Mechanical Turk workers are from the US. 

Mechanical Turk doesn’t stand alone in the market. Other platforms run by companies like Appen, Clickworker, IBM, and Lionbridge all rely on a similar business model. These platforms are known for recruiting low-paid remote workers, and have seen a boom in worker availability since the beginning of the pandemic. Many are expected to continue growing as they provide services for tech giants such as Google.


When contacted by Motherboard, Clickworker confirmed Sarah's claim that majority answers prevail when workers label data. “Usually tasks where answers have to be given are completed by three or more different Clickworkers (depending on the customer's request and quality requirements),” said Ines Maione, Clickworker's marketing manager, in emailed comments sent to Motherboard. “The answers are compared by the system automatically and the correct one can be ensured by majority decision.”
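Clickworker did not elaborate on how that comparison works under the hood, but the mechanism it describes, majority voting over redundant annotations, is simple to picture. The Python sketch below is purely illustrative (none of the names or fields come from Clickworker or Amazon): it picks the most common label for each task and, as a side effect, tallies how often each worker disagreed with the consensus, the kind of score that workers say can cost them future jobs.

```python
from collections import Counter

def aggregate_labels(responses):
    """Pick the majority label per task and count each worker's disagreements.
    `responses` maps task_id -> {worker_id: label}. Illustrative only; not
    any platform's actual code."""
    final_labels = {}
    disagreements = Counter()
    for task_id, worker_labels in responses.items():
        majority_label, _ = Counter(worker_labels.values()).most_common(1)[0]
        final_labels[task_id] = majority_label
        for worker_id, label in worker_labels.items():
            if label != majority_label:
                disagreements[worker_id] += 1
    return final_labels, disagreements

# Example: three workers label the emotion evoked by two artworks.
responses = {
    "artwork_1": {"w1": "awe", "w2": "awe", "w3": "sadness"},
    "artwork_2": {"w1": "fear", "w2": "fear", "w3": "fear"},
}
labels, dissent = aggregate_labels(responses)
print(labels)   # {'artwork_1': 'awe', 'artwork_2': 'fear'}
print(dissent)  # Counter({'w3': 1}) -- the dissenting worker accumulates a strike
```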

However, experts warn that by using these methods, platforms like Clickworker and Mechanical Turk are effectively encouraging conformity. 

“You're teaching people to be compliant,” said Dr. Saiph Savage, director of the Human Computer Interaction Lab at West Virginia University and co-director at the National Autonomous University of Mexico’s Civic Tech lab. “You're also affecting creativity as well. Because if you're outside the norm, or proposing new things, you're actually getting penalized.”

Some microworkers recognize the potential issues with majority decisions and deliberately take time to clarify their answers, despite the lack of financial incentive to do so. In some cases the median pay is less than $3 per hour, so there is pressure to complete as many jobs as possible in a short space of time. “[Mechanical Turk] are really good about including an ‘anything else’ section at the end of each survey,” said Robert, another microworker based in the US. “I sometimes use this section to further explain my position. It does tend to extend your survey time and may not make sense on the financial end,” he added.

Alexandrine Royer, an educational program manager at the Montreal AI Ethics Institute, recently highlighted the importance of regulating microwork, or what she termed “ghost work”, partly on account of this issue. In an article published by the Brookings Institution, she noted that digital workers on the Mechanical Turk marketplace can spend hours labeling pictures they deem offensive, based on their own judgement calls. “It is difficult to quantify the bias that creeps in due to workers' own predispositions and interpretations. Yet, these workers perform tasks that can be highly subjective and culturally-dependent,” she told Motherboard.


Savage recognized the potential for deeply entrenched societal homophobia to impact data annotation tasks too. During an ongoing study around YouTube, her colleagues noticed that some videos were being censored that didn’t seem to violate any of YouTube’s terms of service. “It was basically videos that were related to LGBTQ+ content," she said. “YouTube doesn't ban LGBTQ+ content, but what was happening was that the workers that they had hired to oversee what content gets banned or not banned came from countries where being LGBTQ+ is considered against the law; those workers had their own biases that were influencing how they were labelling content.”

Sarah is often responsible for rating the “featured snippets” that appear as the first answer in a box at the top of search engine results. Robert has been involved in a wide variety of tasks, from “labeling the people in a park” to “talking to chat bots in order to familiarize programs with human interaction,” to slightly stranger tasks such as “ranking photos of human stool samples,” he said. However, some jobs have been downright disturbing, and the data annotation occasionally “exposes workers to graphic and violent images and has been linked to cases of post-traumatic stress disorder,” said Royer.

A 2018 report by the United Nations' International Labour Organization (ILO) on microtask workers highlighted the shockingly low pay, tedious nature of the work, and the apparent negative impact on future employment prospects. It calculated that across five different platforms, workers earned an average of just $4.43 per hour in 2017—not taking into account the unpaid “invisible” work involved such as searching for tasks, taking unpaid qualification tests, researching clients to mitigate fraud, and writing reviews, all of which can be very time consuming. “Median earnings were lower, at just $2.16 per hour,” the report stated.

The compensation is “probably minimum wage if you work through [the tasks] fast,” said Riley, the ArtEmis task worker. “...I can not advocate enough that this is NOT a primary job and do not listen to ANYONE who says otherwise.” 

Workers who label data for machine learning projects can also have their assessments rejected by clients, which means they may not even be paid. “If a requester decides to reject your work, there is no way to contest this and have them make a fair ruling. This is completely up to the requester and you basically did their work for free if they decide to be dishonest,” according to one Mechanical Turk worker cited in the ILO report. 

“Rejecting work means that workers will have completed tasks for an employer, but they're not going to get paid for it,” said Savage. “Being rejected stays forever on the record of the worker. And so, for instance, I had a worker who mentioned that she was rejected [after] she did over 1,000 tasks for an employer.” The worker was completely unable to clear their record and lost subsequent jobs and future opportunities as a result, but had nobody to whom they could complain. However, Amazon has reportedly started to change this by creating reputation systems for the employers themselves.

Amazon did not respond to Motherboard's request for comment. 

In the end, microtask workers are unlikely to see the products of their labor, even though automated systems are projected to boost rates of profitability by an average of 38 percent by 2035. “The labour of these unseen workers generates massive profits that others capture,” stated the February study. It specifically cited the organization behind the commonly used ImageNet database, which pays workers a median of around $2 per hour—presumably not a massive incentive to provide detailed, nuanced, and reflective annotations. Ultimately, “a deep chasm exists between workers and the downstream product,” the study concluded. “The exclusion of those from communities most likely to bear the brunt of algorithmic inequity only stands to worsen.”

This Tool Shows How Google Changes Its Search Results Around the World

'Search Atlas' lets you see beyond the filter bubble that Google's algorithms have built around you.
July 8, 2021, 9:00am
Image: Search Atlas

Researchers have developed a search engine tool that shows what Google search results appear in different countries or languages, highlighting key differences in the algorithm between regions.

Search Atlas allows users to cross the borders of the "filter bubble" they live in, which Google creates using data on a person's location, language, and search history. The tool was created as part of a study conducted by two doctoral students at Carnegie Mellon University and MIT.


Google’s search algorithm already ranks and prioritizes certain websites in search results, whether motivated by its designers’ academic, political, or financial interests. The creators of Search Atlas invite users to “reflect on how their online lives are conditioned by technological infrastructures and geopolitical regimes.”

“Search engines both reflect the world and remap it, determining the information that users see. We built Search Atlas to let you search beyond borders,” Katherine Ye, a doctoral student at Carnegie Mellon University who co-created the tool, recently tweeted.

Ye and Rodrigo Ochigame, a doctoral student at MIT, created Search Atlas using a third-party data scraper to collect search data from Google. Viewing the side-by-side differences in search results shows how Google controls what kind of information users can see.
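The tool’s pipeline isn’t spelled out beyond that, but the underlying idea, issuing the same query as if from different places, can be roughly approximated with Google’s public country (`gl`) and interface-language (`hl`) URL parameters. The short Python sketch below simply builds one search URL per region for side-by-side comparison; it is an assumption-laden illustration, not Search Atlas’s code, and automated requests to these URLs are routinely blocked by Google.

```python
from urllib.parse import urlencode

# Regions and parameters chosen purely for illustration: `gl` sets the
# country and `hl` the interface language on Google's public search URL.
REGIONS = {
    "United Kingdom": {"gl": "uk", "hl": "en"},
    "Singapore":      {"gl": "sg", "hl": "en"},
    "Japan":          {"gl": "jp", "hl": "ja"},
}

def region_urls(query):
    """Return one Google search URL per region for the same query."""
    return {
        region: "https://www.google.com/search?" + urlencode({"q": query, **params})
        for region, params in REGIONS.items()
    }

for region, url in region_urls("Tiananmen Square").items():
    print(f"{region}: {url}")
```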

Google’s massive data index makes it the first tool people turn to when trying to learn something new. “Its dominance is so unquestioned and even people who are thinking critically about technologies are ignoring this big elephant in the room that is search,” Ye told Motherboard. 

In one instance, the study references how the region of Kashmir is depicted with a solid border on Google Maps for users in India, but with a dotted border near Pakistan, marking the territory as disputed, when it’s searched from other countries.

Ochigame and Ye used Search Atlas to search for specific topics that could show the greatest geopolitical differences, such as God or how to combat climate change. A search for Tiananmen Square in the United Kingdom and Singapore pulled up images of soldiers and the famous Tank Man image, but a search in China only shared promotional, tourist images of Tiananmen Square. 

Ye described Google as a giant scraper, meaning it generates search results by scraping data from webpages across the internet to build its massive information index. But Google makes it extremely hard for outsiders to research or audit that data.

The company’s scale and monopoly on data also mean there are few other search engine options outside of Google. As a possible alternative, Ye referenced DuckDuckGo, which does not collect users’ private data, but which operates at a scale and level of efficiency considerably smaller than Google’s.

“This is really helping people broaden their field of view,” said Ye. “It’s just a very small first step or one of several first steps, including building on work that other people have done, toward exposing this power, towards challenging it, building public knowledge and perception of how powerful these technological infrastructures are.”


Hackers Fool Facial Recognition Into Thinking I’m Mark Zuckerberg

Using a new technique, researchers say they can make AI systems misidentify people by adding small bits of data to the images.
June 24, 2021, 9:00am
A modified picture of the author wearing a blue sweater, with search results yielding images of Mark Zuckerberg
Image courtesy of A

An Israeli artificial intelligence company says it has developed a new technique that tricks facial recognition systems by adding noise to photos.

Adversa AI’s technique, announced this week, is designed to fool facial recognition algorithms into identifying a picture of one person’s face as that of someone else by adding minute alterations, or noise, to the original image. The noise tricks the algorithms but is subtle enough that the original image appears normal to the naked eye.


The company announced the technique on its website with a demonstration video showing it altering an image of CEO Alex Polyakov to fool PimEyes, a publicly available facial recognition search engine, into misidentifying his face as that of Elon Musk.

To test this, I sent a photo of myself to the researchers, who ran it through their system and sent it back to me. I uploaded it to PimEyes, and now PimEyes thinks I’m Mark Zuckerberg.

Adversarial attacks against facial recognition systems have been improving for years, as have the defenses against them. But there are several factors that distinguish Adversa AI’s attack, which the company has nicknamed Adversarial Octopus because it is “adaptable,” “stealthy,” and “precise.”

Other methods are “just hiding you, they’re not changing you to somebody else,” Polyakov told Motherboard.

And rather than adding noise to the image data on which models are trained in order to subvert that training—known as a poisoning attack—this technique involves altering the image that will be input into the facial recognition system and doesn’t require inside knowledge of how that system was trained. 


Adversarial Octopus is a “black box,” Polyakov said, meaning even its creators don’t understand the exact logic behind how the neural networks that alter the images achieve their goal.
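The company hasn’t shared how Adversarial Octopus computes its noise. As a rough illustration of what adversarial noise means in practice, here is a minimal sketch of the much older fast gradient sign method (FGSM), a far simpler technique that requires full access to a model’s gradients, unlike the attack described here. It uses PyTorch and a generic ImageNet classifier as a stand-in for a face recognition model; none of it is Adversa AI’s code.

```python
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

# Illustrative FGSM-style targeted perturbation; not Adversa AI's method.
model = models.resnet18(weights="IMAGENET1K_V1").eval()
to_tensor = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
])
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

def perturb_toward(image_path, target_class, epsilon=0.03):
    """Add a small signed-gradient nudge that pushes the image toward
    `target_class` while staying nearly invisible to the eye."""
    x = to_tensor(Image.open(image_path).convert("RGB")).unsqueeze(0)
    x.requires_grad_(True)
    loss = F.cross_entropy(model(normalize(x)), torch.tensor([target_class]))
    loss.backward()
    # Step against the gradient to lower the loss for the target class.
    return (x - epsilon * x.grad.sign()).clamp(0, 1).detach()
```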

Adversa AI has not yet published peer-reviewed research explaining Adversarial Octopus. Polyakov said the company plans to publish after it completes the responsible disclosure process: informing facial recognition companies about the vulnerability and how to defend against it.

It's not the first time researchers have created methods for subverting computer vision systems. Last year, researchers at the University of Chicago released Fawkes, a publicly available privacy tool designed to defeat facial recognition. Shawn Shan, a PhD student and co-creator of Fawkes, told Motherboard that, based on the information Adversa AI has made public, its technique seems feasible for defeating publicly available recognition systems. State-of-the-art systems may prove harder, he said. 

The field is constantly evolving as privacy-minded researchers and purveyors of facial recognition compete in a cat-and-mouse game to find and fix exploits.

The Adversarial Octopus technique could theoretically be put to nefarious uses, such as fooling an online identity verification system that relies on facial recognition in order to commit fraud. It could also be used by “hacktivists” to preserve some of their privacy while still maintaining social media profiles, Polyakov said.

But despite the rapid advances in adversarial attacks, the threats remain largely theoretical at this point.

“We’ve never seen any attack, basically” that has deployed advanced techniques like Adversarial Octopus or Fawkes to commit fraud or other crimes, Shan said. “There’s not too much incentive for now. There are other easier ways to avoid online ID verification.”


Musicians Are Dragging Spotify’s CEO For Funding A Military AI Company

Daniel Ek's $113 million investment in Helsing AI has drawn ire from some artists, who criticize the streaming platform for underpaying musicians.
Janus Rose
New York, US
December 3, 2021, 9:00am
Spotify CEO Daniel Ek wearing a black jacket and sunglasses.
Bloomberg / Getty Images

Posting Spotify Wrapped playlists has become an annual end-of-year tradition on social media—and in many ways the shareable listening stats are symbolic of the streaming platform’s dominance over the music industry.

But some artists have been less enthusiastic this year after a recent announcement that Spotify’s CEO has invested $113 million in Helsing AI, a military defense firm that claims to use algorithmic systems which “integrate data from infrared, video, sonar and radio frequencies, gleaned from sensors on military vehicles, to create a real-time picture of battlefields.”


The move has caused some artists to call for boycotts and bristle at the idea of the company’s streaming profits being used to fund military technology. Some are even offering free or discounted music on competitor platform Bandcamp in exchange for proof that people canceled their Spotify subscriptions.

“His actions proves once again that Ek views Spotify and the wealth he has pillaged from artists merely as a means to further his own wealth,” wrote Zack Nestel-Patt, an organizer with the Union of Musicians and Allied Workers (UMAW), in a statement emailed to Motherboard. Nestel-Patt added that Spotify created software and AI that has eroded the music industry, and is now investing in similar technology to be applied on “battlefields” in order to, as Helsing’s website notes, “serve our democracies.”

The funding comes from Ek’s investment company Prima Materia, which last year earmarked $1.2 billion for investment in European tech companies. 

As the COVID-19 pandemic shut down music venues last year, activists launched multiple campaigns aimed at helping struggling artists and demanding that streaming platforms offer them a fair share of their growing profits. UMAW’s “Justice At Spotify” called for the company to increase revenue share to $0.01 per stream and increase transparency. Fight For The Future also launched a petition after Spotify filed a patent for technology designed to eavesdrop on users’ speech and use emotional data to target ads.

While all streaming music platforms are known for the minuscule royalties they pay independent artists, Spotify is a frequent target of criticism due to its prominence in the space. Critics have slammed the platform for its particularly stingy payouts and algorithmic pay-for-play schemes, which they say advantage large labels while exploiting smaller artists.

Ultimately though, the problem lies not in Spotify itself but in the core business model of streaming, which is designed to gather data on users while profiting off artists’ work. And for many labels and musicians, removing their work from large streaming platforms is simply unfeasible—no matter how shady they may be.

“It was bad enough when big corporate record label CEOs were the gatekeepers of the music industry exploiting artists for our labor,” Evan Greer, a musician and deputy director of Fight For The Future, told Motherboard. “Big Tech CEOs actively building a surveillance-driven dystopian future is even worse."


Man Wrongfully Arrested By Facial Recognition Tells Congress His Story

Robert Williams was arrested last year in Detroit after a facial recognition system misidentified him as a suspect.
July 13, 2021, 3:58pm
A screenshot of Robert Williams testifying during a Congressional hearing on facial recognition technology.

Michigan resident Robert Williams testified about being wrongfully arrested by Detroit Police in an effort to urge Congress to pass legislation against the use of facial recognition technology. 

Williams' testimony was part of a hearing held by the House of Representatives' subcommittee on crime, terrorism, and homeland security, which dealt with how law enforcement uses the highly controversial surveillance technology. Congress recently introduced the Facial Recognition and Biometric Technology Moratorium Act, which would indefinitely ban its use by law enforcement.


Williams was wrongfully arrested in 2020 for felony larceny after the Detroit Police Department’s facial recognition software misidentified him using a grainy image from surveillance footage. He was then picked from a photo lineup by a store security guard who wasn’t actually present for the incident. According to his testimony, Williams was detained for thirty hours and was not given any food or water.

“I don’t even live in Detroit and Detroit Police came to my house in Farmington Hills and basically carted me off,” he said in his testimony. “I don’t think it’s fair that my picture was used in some type of lineup and I’ve never been in trouble.”

Motherboard previously reported that the Detroit Police's facial recognition system also led to the false arrest of another man, Michael Oliver. Oliver is now suing the city, along with the ACLU.

Research has repeatedly shown that facial recognition technology is fundamentally biased against women and people of color, leading to errors like this. Privacy advocates have argued that even when the technology works properly, it disproportionately targets communities of color, creating further pretext for police intervention.

“Large scale adoption of this technology would inject further inequity into a system at a time when we should be moving to make the criminal justice system more equitable,” Representative Sheila Jackson Lee (TX-18) said during the hearing. 

The subcommittee also referenced a recent study from the U.S. Government Accountability Office that reported that 20 federal agencies used facial recognition software last year. Six federal agencies, including the FBI and the U.S. Postal Service, reported using it during the 2020 Black Lives Matter protests that followed the police murder of George Floyd. 

Williams is just one of many people impacted by the technology’s errors and biases; Detroit police had misidentified Oliver using the same software just the year before.

Williams is now represented by the ACLU and is suing the Detroit Police Department for damages and policy changes to prohibit the use of facial recognition technology. So far, the technology has been banned statewide in Vermont and Virginia, as well as in 20 cities across the US.

“Mr. Williams deserved better from the law enforcement agencies entrusted to use a technology that we all know is less accurate when applied to citizens who look like him,” House Judiciary Committee Chairman Jerrold Nadler (D-NY) said during the hearing.
